Speaker and language adaptive training for HMM-based polyglot speech synthesis
Abstract
This paper proposes a novel technique for speaker and language adaptive training for HMM-based statistical parametric polyglot speech synthesis. Language-specific context dependencies are captured using cluster adaptive training (CAT) with cluster-dependent decision trees, and acoustic variations caused by speaker characteristics are handled by constrained maximum likelihood linear regression (CMLLR) transforms. This framework allows multi-speaker/multi-language adaptive training and synthesis to be performed. Experimental results show that the proposed technique achieves better naturalness than both speaker-adaptively trained language-dependent and language-independent models.
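As a rough illustration of how the two factors in the abstract can be separated, the sketch below writes out the textbook forms of a CAT mean and a CMLLR feature transform. The notation is an assumption and is not taken from the paper: $o_t$ is an observation vector, $s$ a speaker, $l$ a language, $m$ a Gaussian component, $P$ the number of clusters, $c(m,p)$ the leaf of cluster $p$'s decision tree that component $m$ falls into, $\lambda_p^{(l)}$ a language-dependent interpolation weight, and $A^{(s)}, b^{(s)}$ a speaker-dependent affine transform.

```latex
% Sketch under assumed notation; the paper's exact parameterization may differ.
\begin{align}
  % CAT: the language-dependent mean is a weighted sum of cluster means,
  % each cluster having its own decision tree.
  \mu_m^{(l)} &= \sum_{p=1}^{P} \lambda_p^{(l)} \, \mu_{c(m,p)} \\
  % CMLLR: a speaker-dependent affine transform applied to the features.
  \hat{o}_t^{(s)} &= A^{(s)} o_t + b^{(s)} \\
  % Resulting state-output density for speaker s and language l.
  p(o_t \mid m, s, l) &= \bigl|\det A^{(s)}\bigr| \,
      \mathcal{N}\!\bigl(\hat{o}_t^{(s)};\, \mu_m^{(l)},\, \Sigma_m\bigr)
\end{align}
```

In this reading, changing the weights $\lambda^{(l)}$ switches language while changing $(A^{(s)}, b^{(s)})$ switches speaker, which is what would let one speaker's transform be combined with another language's weights at synthesis time.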
Similar resources
HMM-based polyglot speech synthesis by speaker and language adaptive training
This paper describes a technique for speaker and language adaptive training (SLAT) for HMM-based polyglot speech synthesis and its evaluations on a multi-lingual speech corpus. The SLAT technique allows multi-speaker/multi-language adaptive training and synthesis to be performed. Experimental results show that the SLAT technique achieves better naturalness than both speaker-adaptively trained l...
New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer
In this paper we present a new method for synthesizing multiple languages with the same voice, using HMM-based speech synthesis. Our approach, which we call HMM-based polyglot synthesis, consists of mixing speech data from several speakers in different languages, to create a speaker- and language-independent (SI) acoustic model. We then adapt the resulting SI model to a specific speaker in order ...
New approach to polyglot synthesis: how to speak any language with anyone’s voice
In this paper we present a new method to synthesize multiple languages with the voice of any arbitrary speaker. We call this method “HMM-based speaker-adaptable polyglot synthesis”. The idea consists in mixing data from several speakers in different languages to create a speaker-independent multilingual acoustic model. By means of MLLR, we can adapt this model to the voice of any given speaker. ...
A Study on Speaker-Adaptable Multilingual Synthesis
This thesis introduces a new method for synthesizing multiple languages with the voice of any speaker, so that for example Japanese speech can be synthesized with the voice of a Russian monolingual speaker. This approach is based on the hypothesis that the average voice created by mixing a sufficient number of speakers is the same for all languages, i.e., the average voice is equivalent to a po...
Cross-lingual voice conversion-based polyglot speech synthesizer for Indian languages
A polyglot speech synthesizer synthesizes speech for any given monolingual or multilingual text in a single speaker’s voice. This requires a polyglot speech corpus, but it is difficult to find a speaker proficient in multiple languages. Therefore, in the current work, by exploiting the acoustic similarity of phonemes across Indian languages, a polyglot speech corpus is obtained for ...